Modeling and Information Rates for 
Synchronization Error Channels 



Aravind R. Iyengar, Paul H. Siegel, and Jack K. Wolf 
University of California San Diego, La Jolla, CA 92093 - 0401, USA 
Email: {aravind, psiegel, jwolf}@ucsd.edu



Abstract — We propose a new channel model for channels with 
synchronization errors. Using this model, we give simple, non-trivial
and, in some cases, tight lower bounds on the capacity for
certain synchronization error channels. 

I. Introduction 

Channels with synchronization errors have been of interest 
from the very beginnings of information theory. However, 
little is known of their capacities or of good coding schemes. 
In the last decade, a flurry of activity has led to significant
progress in estimating achievable information rates for certain
synchronization error channels (SECs). A "good" coding
scheme, however, remains elusive.

In this paper, we model an SEC as a channel with states 
and use this model to arrive at some simple lower bounds on 
the capacity. Although the idea behind the alternative model is
straightforward, the model itself has been absent from the literature.
While the present paper deals only with a few asymptotic 
results on information rates of the SEC, we think that the 
model presented here can be utilized to design codes for SECs 
in general. 

The remainder of this paper is organized as follows. In
Section II, we recall a few of the main results on the capacity
of SECs. We consider a special case of the generic SEC,
the deletion-duplication channel (DDC), and construct an
equivalent channel by viewing the SEC as a channel with
states in Section III. We use the model to obtain bounds on
the capacity in Section IV. We conclude in Section V by
highlighting the possible advantages of the model.

II. Synchronization Error Channels 

Remark 1 (Notation): Non-random variables are written as
lowercase letters, e.g. $n$. We denote sets by double-stroke
uppercase letters, e.g. $\mathbb{X}$, and define $[n] = \{1, 2, \dots, n\}$,
$[0] = \emptyset$ and $[m:n] = \{m, m+1, \dots, n\}$ for $m < n$. We assume
an underlying probability space $(\Omega, \mathcal{F}, P)$ over which all
random variables, denoted by uppercase letters, e.g. $X$, are defined.
Random vectors are denoted by uppercase letters with the set
of indices as subscripts, e.g. $X_{[n]} = (X_1, X_2, \dots, X_n)$, or
$X_{Y_{[n]}}$ when the set of indices is itself a random vector $Y_{[n]}$.
Random processes are denoted by script letters, e.g. $\mathcal{X}$, or as $X_{\mathbb{N}}$.

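Let $\mathbb{X}$ be a finite set. A memoryless synchronization error
channel is specified by a stochastic matrix
$\{q(\bar{y}|x),\ \bar{y} \in \bar{\mathbb{Y}},\ x \in \mathbb{X}\}$, where $\mathbb{Y}$ is the output
alphabet and $\bar{\mathbb{Y}}$ is the set of all strings (including the empty
string $\lambda$) over $\mathbb{Y}$. We assume that the expected length of the
output string arising from one input symbol is strictly positive
and finite, i.e.
$0 < \sum_{\bar{y} \in \bar{\mathbb{Y}}} |\bar{y}|\, q(\bar{y}|x) < \infty$,
where $|\bar{y}|$ denotes the length of the string $\bar{y}$ (the number of
symbols in $\bar{y}$). For $x_{[n]} = (x_1, x_2, \dots, x_n) \in \mathbb{X}^n$ and
$\bar{y}_{[n]} = (\bar{y}_1, \bar{y}_2, \dots, \bar{y}_n) \in \bar{\mathbb{Y}}^n$, we write
$q_n(\bar{y}_{[n]}|x_{[n]}) = \prod_{i=1}^{n} q(\bar{y}_i|x_i)$. Let
$\langle \bar{y}_{[n]} \rangle$ denote the concatenation of the strings $\bar{y}_i$, $i \in [n]$.
Then a memoryless SEC $Q_n$ is specified by the input alphabet $\mathbb{X}$,
output alphabet $\mathbb{Y}$ and transition probabilities

$$Q_n(\bar{y}|x_{[n]}) = \sum_{\bar{y}_{[n]} :\ \langle \bar{y}_{[n]} \rangle = \bar{y}} q_n(\bar{y}_{[n]}|x_{[n]}) \tag{1}$$

for $\bar{y} \in \bar{\mathbb{Y}}$ and $x_{[n]} \in \mathbb{X}^n$. Consider the sequence of channels
$\{Q_n\}_{n=1}^{\infty}$ where $Q_n$ is as defined above. Then, the following
was shown by Dobrushin in 1967.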

Theorem 1 (Coding Theorem [1]): Let $X_{[n]}$ and $\bar{Y}$ denote
the input and the output of the SEC $Q_n$. Let

$$C_n = \sup_{P(X_{[n]})} \frac{1}{n} I(X_{[n]}; \bar{Y}).$$

Then $C = \lim_{n \to \infty} C_n = \inf_{n \geq 1} C_n$ exists and is equal to the
transmission capacity of the SEC. Furthermore,

$$C = \sup_{X_{\mathbb{N}}} \lim_{n \to \infty} \frac{1}{n} I(X_{[n]}; \bar{Y}),$$

where $X_{\mathbb{N}}$ is a stationary, ergodic, Markov process over $\mathbb{X}$.

We will consider an example of an SEC and confine our
attention to this channel throughout this paper.

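Example 2 (Deletion-Duplication Channel (DDC)):
Consider the binary SEC with $\mathbb{X} = \mathbb{Y} = \{0, 1\}$ and the
following stochastic matrix

$$q(\bar{y}|x) = \begin{cases} p_{\mathrm{d}}, & \bar{y} = \lambda \\ p_{\mathrm{t}}\, p_{\mathrm{i}}^{r-1}, & \bar{y} = x^r,\ \forall\, r \geq 1 \end{cases}$$

where $x^r$ denotes $r$ copies of $x$, with
$p_{\mathrm{d}} + \sum_{r \geq 1} p_{\mathrm{t}}\, p_{\mathrm{i}}^{r-1} = 1$, so that
$p_{\mathrm{t}} = (1 - p_{\mathrm{d}})(1 - p_{\mathrm{i}})$ for
$p_{\mathrm{i}} < 1$. This model implies that deletions and duplications
occur i.i.d. (and mutually exclusively) with probabilities $p_{\mathrm{d}}$
and $p_{\mathrm{i}}$ respectively. The expected output string length is

$$0 < \frac{1 - p_{\mathrm{d}}}{1 - p_{\mathrm{i}}} < \infty.$$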



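Hence $(p_{\mathrm{i}}, p_{\mathrm{d}}) \in [0, 1)^2$. Since the capacity is zero when either
$p_{\mathrm{i}}$ or $p_{\mathrm{d}}$ is 1 (they cannot simultaneously be 1), this model
does indeed represent the entire class of deletion-duplication
channels. Note that when $p_{\mathrm{i}} = 0$, the DDC is the same as the
binary deletion channel (BDC), and when $p_{\mathrm{d}} = 0$, it is the
so-called binary sticky channel [2].

$$\begin{aligned}
P_n(\bar{y}|x_{[n]}, Z_0 = 0) &= \sum_{\bar{z} :\, |\bar{z}| = |\bar{y}|} P(\bar{Z} = \bar{z}, \bar{Y} = \bar{y} \,|\, X_{[n]} = x_{[n]}, Z_0 = 0) \\
&= \sum_{\bar{z} :\, |\bar{z}| = |\bar{y}|} P(\bar{Z} = \bar{z} \,|\, Z_0 = 0)\, P(\bar{Y} = \bar{y} \,|\, X_{[n]} = x_{[n]}, Z_0 = 0, \bar{Z} = \bar{z}) \\
&= \sum_{\bar{z} :\, |\bar{z}| = |\bar{y}|} \prod_{i=1}^{|\bar{y}|} \Big( P(Z_i = z_i \,|\, Z_{i-1} = z_{i-1}, Z_0 = 0)\, P(Y_i = y_i \,|\, X_{[n]} = x_{[n]}, Z_i = z_i) \Big)
\end{aligned} \tag{2}$$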
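
As an illustration of Example 2, the following Python sketch samples the DDC one input symbol at a time. It assumes the reading of the stochastic matrix in which each input bit is deleted with probability p_d and is otherwise emitted r >= 1 times with probability (1 - p_i) p_i^(r-1). The function name `ddc_transmit`, the seed and the parameter values are illustrative choices and do not come from the paper.

```python
import random

def ddc_transmit(x, p_d, p_i, rng):
    """Pass the bit list x through a deletion-duplication channel (sketch)."""
    y = []
    for bit in x:
        if rng.random() < p_d:
            continue              # symbol deleted: the output string is empty
        copies = 1                # symbol is transmitted at least once
        while rng.random() < p_i:
            copies += 1           # each further copy occurs with probability p_i
        y.extend([bit] * copies)
    return y

rng = random.Random(0)
x = [rng.randrange(2) for _ in range(200_000)]
y = ddc_transmit(x, p_d=0.2, p_i=0.3, rng=rng)
ratio = len(y) / len(x)           # empirical output length per input symbol
```

For long inputs, `ratio` should be close to the expected output length (1 - p_d)/(1 - p_i) per input symbol. Setting p_i = 0 gives a BDC sampler and p_d = 0 a sticky channel sampler.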

The BDC has been the most well-studied SEC. In [3], the
author surveys the results that were known prior to 2009.
To summarize, the best known lower bounds were obtained,
chronologically, through bounds on the cutoff rate for
sequential decoding [4], bounding the rate with a first-order
Markov input [5], reduction to a Poisson-repeat channel [6],
analyzing a "jigsaw-puzzle" coding scheme [7], or by directly
bounding the information rate by analyzing the channel as a
joint renewal process [8]. Recently, [9] and [10] independently
gave the capacity of a BDC with small deletion probabilities,
and showed that it is achieved by independent and uniformly
distributed (i.u.d.) inputs. The known upper bounds for the
BDC have been obtained by genie-aided decoder arguments
[11], [12]. An idea from [12] was extended to obtain some
analytical lower bounds on the capacity of channels that
involve substitution errors as well as insertions or deletions [13].
In contrast to these existing results, our approach explicitly
characterizes the achievable information rates in terms of
"subsequence-weights", which is a measure relevant in ML
decoding for the BDC [3]. Additionally, the method proposed
here gives the tight bound on capacity for small deletion
probabilities obtained in [9] more directly.

For the sticky channel, [2] obtained lower bounds on the
capacity by numerically estimating the capacity per unit cost
of the equivalent channel of runs through optimization of 8 and
16 bit codes. Here, we obtain direct analytical lower bounds on
the capacity. These, to the best of our knowledge, represent the
only analytical bounds for the capacity of the sticky channel.



III. SEC as a Channel with States

For the DDC $Q_n$, we write

$$Y_i = X_{\Gamma_i} = X_{i - Z_i}, \tag{3}$$

where $Z_i \in \mathbb{Z}$ is the "state" of the channel and we define the
length of the output to be
$N_n = \sup\{i \geq 0 : \Gamma_i \leq n \mid \Gamma_0 = 0\}$.
The state process $\mathcal{Z}$ is independent of the channel input
process $\mathcal{X}$, and is a time-homogeneous Markov chain over the
set of integers $\mathbb{Z}$ with transition probabilities

$$P(Z_i = z_i \,|\, Z_{i-1} = z_{i-1}) = \begin{cases} p_{\mathrm{i}}, & z_i = z_{i-1} + 1 \\ p_{\mathrm{t}}\, p_{\mathrm{d}}^{r}, & z_i = z_{i-1} - r,\ r \geq 0 \end{cases} \tag{4}$$

where we define $p_{\mathrm{t}} = (1 - p_{\mathrm{d}})(1 - p_{\mathrm{i}})$ for normalization. We
also assume the boundary condition that $Z_0 = 0$, i.e., that
there was perfect synchronization initially. It is easy to see
that $N_n < \infty\ \forall\, n \in \mathbb{N}$ a.s. since we impose $p_{\mathrm{i}} < 1$, and
that $N_n \to \infty$ a.s. as $n \to \infty$ since $p_{\mathrm{d}} < 1$. We refer to
the $\Gamma$-process as the index process. The index process and
the channel state process have a one-to-one correspondence,
and consequently, we will use them interchangeably. From the
state transition probabilities in (4), it is also clear that the $\mathcal{Z}$
process is shift-invariant, i.e.,
$P(Z_i = z_i \,|\, Z_{i-1} = z_{i-1}) = P(Z_i = z_i - z_{i-1} \,|\, Z_{i-1} = 0)$.
The index process inherits this property from the $\mathcal{Z}$ process.

For $\bar{y} \in \bar{\mathbb{Y}}$ and $x_{[n]} \in \mathbb{X}^n$, the channel transition
probabilities are given as in Equation (2). Note that in the terms within
the parentheses on the right hand side of Equation (2), the
first term is completely specified by the transition probabilities
(4) of the channel state process $\mathcal{Z}$, and the second term is
1 or 0 according as $y_i = x_{i - z_i}$ or $y_i \neq x_{i - z_i}$, respectively.
The input and output alphabets of $P_n$ are $\mathbb{X} = \mathbb{Y} = \{0, 1\}$.
The equivalence between the DDC $Q_n$ and $P_n$, for any $n$,
is evident by noting that for every parsing of $\bar{y} \in \bar{\mathbb{Y}}$ as
in Equation (1), there is a corresponding state path $\bar{z} \in \bar{\mathbb{Z}}$
in Equation (2) (and vice versa) and that the terms within
the parentheses in (2), when grouped according to the output
symbols arising from the same input symbol, spell out exactly
the same probability as the terms $q(\bar{y}_i|x_i)$.
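
The correspondence between a parsing of the output and a state path can be made concrete with a small Python sketch. It assumes the convention that output symbol i is read from input position i - z_i (1-based), so that z_i is the output index minus the index of the input symbol that produced it. The helper names and the example parsing are hypothetical.

```python
def output_from_parsing(x, lengths):
    """Concatenate lengths[k] copies of x[k]; a length of 0 is a deletion."""
    y = []
    for bit, r in zip(x, lengths):
        y.extend([bit] * r)
    return y

def states_from_parsing(lengths):
    """State path z_1..z_N with z_i = i - (1-based input index behind output i)."""
    z, i = [], 0
    for k, r in enumerate(lengths, start=1):
        for _ in range(r):
            i += 1
            z.append(i - k)
    return z

def output_from_states(x, z):
    """Read the output through the state path: y_i = x_{i - z_i} (1-based)."""
    return [x[i - z_i - 1] for i, z_i in enumerate(z, start=1)]

x = [0, 1, 1, 0]
lengths = [1, 0, 2, 1]            # keep, delete, emit twice, keep
z = states_from_parsing(lengths)  # state path for this parsing
```

In this example the state drops by one after the deletion and rises by one at the duplicated symbol, and both views produce the same output string.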

As a consequence of the above equivalence, Theorem 1
applies to the sequence of channels $\{P_n\}_{n=1}^{\infty}$ specified by
Equations (3) and (4). We hence have, for input $X_{[n]}$ and output
$Y_{[N_n]}$ of $P_n$,

$$C = \lim_{n \to \infty} \sup_{P(X_{[n]})} \frac{1}{n} I(X_{[n]}; Y_{[N_n]}) = \sup_{X_{\mathbb{N}}} \lim_{n \to \infty} \frac{1}{n} I(X_{[n]}; Y_{[N_n]}).$$

We will restrict our attention to stationary, ergodic, Markov
sources $X_{\mathbb{N}}$. Under this assumption, the output process $\mathcal{Y}$ is
also stationary, and the entropy rate $\mathcal{H}(\mathcal{Y})$ is well-defined.

IV. Bounds on Capacity 



With the setting of Section III it is possible to immediately 
obtain some non-trivial bounds on the capacity. We start with 
some simple bounds and bounding techniques for the DDC 
and consider the BDC and the sticky channel in separate 
subsections. 



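Proposition 3: For the deletion-duplication channel,

$$\left(1 - p_{\mathrm{d}} - \frac{1 - p_{\mathrm{d}}}{1 - p_{\mathrm{i}}}\, h_2(p_{\mathrm{i}}) - h_2(p_{\mathrm{d}})\right)^{+} \leq C \leq 1 - p_{\mathrm{d}},$$

where $(x)^+ = \max\{0, x\}$ and $h_2(\cdot)$ is the binary entropy
function.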

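Proof: We can write

$$\begin{aligned}
I(X_{[n]}; Y_{[N_n]}) &\overset{(a)}{=} I(X_{[n]}; Y_{[N_n]} \,|\, Z_{[N_n]}) - I(X_{[n]}; Z_{[N_n]} \,|\, Y_{[N_n]}) \\
&\overset{(b)}{=} (1 - p_{\mathrm{d}})\, H(X_{[n]}) - I(X_{[n]}; Z_{[N_n]} \,|\, Y_{[N_n]}),
\end{aligned} \tag{5}$$

where (a) is true because $\mathcal{X} \perp\!\!\!\perp \mathcal{Z}$ and (b) follows from the fact that
the DDC, given the $\mathcal{Z}$ process realization, is equivalent to a
BEC with erasure rate $p_{\mathrm{d}}$. Then,

$$n(1 - p_{\mathrm{d}}) \geq I(X_{[n]}; Y_{[N_n]}) \geq (1 - p_{\mathrm{d}})\, H(X_{[n]}) - H(Z_{[N_n]}).$$

Since the $\mathcal{Z}$ process is a Markov chain, we can easily
show $H(Z_{[N_n]}) \leq E(N_n)\left(h_2(p_{\mathrm{i}}) + \frac{1 - p_{\mathrm{i}}}{1 - p_{\mathrm{d}}}\, h_2(p_{\mathrm{d}})\right)$, where the
inequality follows because, for any finite $n$, we have the extra
knowledge that $Z_i \geq i - n$ by definition. This extra knowledge
becomes tautological when $n \to \infty$ so that

$$\lim_{n \to \infty} \frac{H(Z_{[N_n]})}{n} = \left(\lim_{n \to \infty} \frac{E(N_n)}{n}\right) \left(h_2(p_{\mathrm{i}}) + \frac{1 - p_{\mathrm{i}}}{1 - p_{\mathrm{d}}}\, h_2(p_{\mathrm{d}})\right).$$

Writing the $\Gamma$-process as a renewal process, from the strong
law of large numbers, $\frac{N_n}{n} \to \frac{1 - p_{\mathrm{d}}}{1 - p_{\mathrm{i}}}$ a.s. as $n \to \infty$. ■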
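
The per-step entropy used in the proof can be checked numerically. The Python sketch below assumes that the state increment is +1 with probability p_i and -r (r >= 0) with probability (1 - p_d)(1 - p_i) p_d^r, and compares a direct summation of the increment entropy with the closed form h2(p_i) + (1 - p_i)/(1 - p_d) h2(p_d). The function names and the truncation point `r_max` are illustrative choices.

```python
from math import log2

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def step_pmf(p_d, p_i, r_max=2000):
    """P(Z_i - Z_{i-1} = s): s = +1 (duplication) or s = -r, r >= 0 (r deletions)."""
    p_t = (1 - p_d) * (1 - p_i)
    pmf = {1: p_i}
    for r in range(r_max + 1):
        pmf[-r] = p_t * p_d ** r
    return pmf

def step_entropy_numeric(p_d, p_i):
    """Entropy of one state increment by direct (truncated) summation."""
    return -sum(q * log2(q) for q in step_pmf(p_d, p_i).values() if q > 0)

def step_entropy_closed_form(p_d, p_i):
    """Closed form appearing in the bound on H(Z)."""
    return h2(p_i) + (1 - p_i) / (1 - p_d) * h2(p_d)
```

The two functions should agree up to the truncation error of the geometric tail, which is negligible for p_d well below 1.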
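Note that the above result implies the following for the BDC
($p_{\mathrm{i}} = 0$, $p_{\mathrm{d}} = p$), the symmetric DDC (with $p_{\mathrm{i}} = p_{\mathrm{d}} = p$) and
the sticky channel ($p_{\mathrm{i}} = p$, $p_{\mathrm{d}} = 0$) respectively.

$$(1 - p - h_2(p))^{+} \leq C_{\mathrm{BDC}} \leq 1 - p,$$
$$(1 - p - 2h_2(p))^{+} \leq C_{\mathrm{SDDC}} \leq 1 - p,$$
$$\left(1 - \frac{h_2(p)}{1 - p}\right)^{+} \leq C_{\mathrm{sticky}} \leq 1.$$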



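Although these bounds have simple closed-form expressions,
they are far from the best known bounds for the capacity of
these channels. We can, however, improve these bounds. We
have from Equation (5),

$$\begin{aligned}
I(X_{[n]}; Y_{[N_n]}) = {} & (1 - p_{\mathrm{d}})\, H(X_{[n]}) + I(Y_{[N_n]}; Z_{[N_n]}) \\
& - H(Z_{[N_n]}) + H(Z_{[N_n]} \,|\, X_{[n]}, Y_{[N_n]}).
\end{aligned} \tag{6}$$

We can easily show that
$\mathcal{H}(\mathcal{Z}|\mathcal{X}, \mathcal{Y}) = \frac{1 - p_{\mathrm{d}}}{1 - p_{\mathrm{i}}}\, H(Z_1|\mathcal{X}, \mathcal{Y})$, and
hence

$$C \geq \sup_{\mathcal{X}} \left(\mathcal{H}(\mathcal{X}) + \frac{H(Z_1|\mathcal{X}, \mathcal{Y})}{1 - p_{\mathrm{i}}}\right)(1 - p_{\mathrm{d}}) - \frac{1 - p_{\mathrm{d}}}{1 - p_{\mathrm{i}}}\, h_2(p_{\mathrm{i}}) - h_2(p_{\mathrm{d}}). \tag{7}$$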



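It is not easy to evaluate the right hand side of the above
inequality. However, we can lower bound it further by
introducing some conditioning.

$$C \geq \sup_{\mathcal{X}} \left(\mathcal{H}(\mathcal{X}) + \frac{H(Z_1|Z_i, \mathcal{X}, \mathcal{Y})}{1 - p_{\mathrm{i}}}\right)(1 - p_{\mathrm{d}}) - \frac{1 - p_{\mathrm{d}}}{1 - p_{\mathrm{i}}}\, h_2(p_{\mathrm{i}}) - h_2(p_{\mathrm{d}}) = \sup_{\mathcal{X}} L_i^{\mathcal{X}} \quad \forall\, i \in \mathbb{N}.$$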



Lemma 4: The sequence $\{L_i^{\mathcal{X}}\}_{i=1}^{\infty}$ is non-decreasing.
Proof: We have

$$\begin{aligned}
H(Z_1|Z_{i+1}) &= H(Z_1, Z_i|Z_{i+1}) - H(Z_i|Z_1, Z_{i+1}) \\
&\overset{(a)}{=} H(Z_i|Z_{i+1}) + H(Z_1|Z_i) - H(Z_i|Z_1, Z_{i+1}) \\
&= H(Z_1|Z_i) + I(Z_1; Z_i|Z_{i+1}) \geq H(Z_1|Z_i),
\end{aligned}$$

where (a) follows from the Markovity of the $\mathcal{Z}$ process. Since
conditioning on $\mathcal{X}$ and $\mathcal{Y}$ retains the above chain of
inequalities, we have
$H(Z_1|Z_{i+1}, \mathcal{X}, \mathcal{Y}) \geq H(Z_1|Z_i, \mathcal{X}, \mathcal{Y})\ \forall\, i \geq 1$.
Hence $\{L_i^{\mathcal{X}}\}_{i=1}^{\infty}$ is non-decreasing, and maximizing over
stationary, ergodic, Markov input processes $\mathcal{X}$ gives the bound in
Proposition 3 for $i = 1$. Therefore, for increasing $i$, we have
bounds better than the one in Proposition 3 and in the limit
as $i \to \infty$, we approach the bound in (7). ■
For the case of the BDC and the sticky channel, evaluating 
some of these bounds is easier, owing to the fact that the Z 
process is monotonic, i.e., in these cases, the output is just a 
subsequence of the input sequence and vice versa respectively. 

A. Information Rates for the BDC 

For the BDC with i.u.d. inputs, we can easily show that $\mathcal{Y}$
is also an i.u.d. sequence. Consequently, $\mathcal{I}(\mathcal{Y}; \mathcal{Z}) = 0$ because
the only information obtained from $Y_{[N_n]}$ about $Z_{[N_n]}$ is the
length of the vector, and this information vanishes in the limit
as $n \to \infty$ as a result of the concentration. Therefore, we
have from Equation (6) that the lower bound in Equation (7)
is actually the symmetric information rate (SIR) in this case.
Let us denote by $w_{y_{[i]}}(x_{[j]})$ the number of subsequences of
$x_{[j]}$ that are the same as $y_{[i]}$, and define $w_{\lambda}(x_{[j]}) = 1\ \forall\, x_{[j]}$.
We will refer to $w_{y_{[i]}}(x_{[j]})$ as the $y_{[i]}$-subsequence weight of
the vector $x_{[j]}$.
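
The subsequence weight can be computed with a standard dynamic program over prefixes. The Python sketch below interprets "number of subsequences" as the number of distinct sets of positions of x that spell out y, and returns 1 for the empty y, matching the convention stated above; the function name `subsequence_weight` is an illustrative choice.

```python
def subsequence_weight(y, x):
    """Number of ways to obtain y from x by deleting symbols of x."""
    # ways[k] = number of embeddings of y[:k] into the part of x scanned so far
    ways = [1] + [0] * len(y)
    for symbol in x:
        # scan k right-to-left so this symbol of x is used at most once per embedding
        for k in range(len(y), 0, -1):
            if y[k - 1] == symbol:
                ways[k] += ways[k - 1]
    return ways[len(y)]
```

For instance, `subsequence_weight([0, 1], [0, 0, 1])` counts the two ways of deleting one of the leading zeros. The run time is proportional to the product of the two lengths.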

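We will focus on the term $H(Z_1|Z_i, \mathcal{X}, \mathcal{Y})$, which is the
only term to be evaluated to get an estimate of $L_i^{\mathrm{iud}}$, the
bound $L_i^{\mathcal{X}}$ for i.u.d. inputs. First
note that $H(Z_1|Z_i, \mathcal{X}, \mathcal{Y}) = H(Z_1|Z_i, X_{[i - Z_i - 1]}, Y_{[i-1]})$.
Given $Z_i = -m$, $X_{[m+i-1]} = x_{[m+i-1]}$ and $Y_{[i-1]} = y_{[i-1]}$, we have
$Z_1 \in \{j \in \{0, -1, \dots, -m\} : x_{1-j} = y_1,\ w_{y_{[2:i-1]}}(x_{[2-j:m+i-1]}) > 0\}$.
Further, it is easy to see that

$$P(Z_1 = z \,|\, Z_i = -m, X_{[m+i-1]} = x_{[m+i-1]}, Y_{[i-1]} = y_{[i-1]}) = \mathbb{1}_{\{x_{1-z} = y_1\}}\, \frac{w_{y_{[2:i-1]}}(x_{[2-z:m+i-1]})}{w_{y_{[i-1]}}(x_{[m+i-1]})}, \quad -m \leq z \leq 0.$$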
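
The posterior of the first state can also be obtained by brute force, which gives a way to sanity-check the ratio of subsequence weights on small examples. The Python sketch below assumes a BDC in which, given the number of deletions, every deletion pattern is equally likely. It enumerates all ways of keeping len(y) symbols of the input prefix and tallies the position of the first kept symbol. The function name `z1_posterior` and the sample vectors are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def z1_posterior(x_prefix, y_prefix):
    """Posterior of Z_1 given the first m+i-1 inputs and first i-1 outputs (BDC).

    Assumes y_prefix is non-empty and can be obtained from x_prefix by deletions.
    """
    counts = {}
    for kept in combinations(range(len(x_prefix)), len(y_prefix)):
        if [x_prefix[k] for k in kept] == list(y_prefix):
            z1 = -kept[0]           # number of symbols deleted before the first output
            counts[z1] = counts.get(z1, 0) + 1
    total = sum(counts.values())    # number of consistent patterns, i.e. the weight of y in x
    return {z: Fraction(c, total) for z, c in counts.items()}

posterior = z1_posterior([0, 0, 1], [0, 1])
```

Exhaustive enumeration is only practical for short vectors; it is meant as a check of the formula rather than as a way to evaluate the bounds.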



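Since $P(X_{[m+i-1]} = x_{[m+i-1]} \,|\, Z_i = -m) = 2^{-(m+i-1)}$,

$$P(Y_{[i-1]} = y_{[i-1]} \,|\, X_{[m+i-1]} = x_{[m+i-1]}, Z_i = -m) = \frac{w_{y_{[i-1]}}(x_{[m+i-1]})}{\binom{m+i-1}{m}}$$

and $P(Z_i = -m \,|\, Z_0 = 0) = p^{\otimes i}(-m)$, where we write
$p(-m) = P(Z_1 = -m \,|\, Z_0 = 0)$, $p^{\otimes i}(-m) = (p \otimes p^{\otimes (i-1)})(-m)$,
$p^{\otimes 1}(-m) = p(-m)$ with $\otimes$ denoting convolution,
we have for any $i \in \mathbb{N}$

$$C \geq L_i^{\mathrm{iud}} = \Big(1 + \sum_{m \geq 0} p^{\otimes i}(-m)\, b_m^{(i)}\Big)(1 - p) - h_2(p),$$

$$b_m^{(i)} = \sum_{x_{[m+i-1]}} \sum_{y_{[i-1]}} \frac{w_{y_{[i-1]}}(x_{[m+i-1]})}{2^{m+i-1} \binom{m+i-1}{m}}\, H(Z_1 \,|\, Z_i = -m, X_{[m+i-1]} = x_{[m+i-1]}, Y_{[i-1]} = y_{[i-1]}), \tag{8}$$

where $b_m^{(i)}$ is as given in Equation (8). Unfortunately, evaluating
$b_m^{(i)}$ for $i > 2$ is hard since counting subsequences is not
easy. For the case of $i = 2$, we can easily evaluate

$$L_2^{\mathrm{iud}} = \Big(1 + (1 - p)^2 \sum_{m \geq 0} (m + 1)\, p^m\, b_m^{(2)}\Big)(1 - p) - h_2(p) \tag{9}$$

with

$$b_m^{(2)} = \log_2(m + 1) - \sum_{l=0}^{m+1} \binom{m+1}{l} 2^{-(m+1)}\, h_2\Big(\frac{l}{m+1}\Big).$$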
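
The bound for i = 2 is easy to evaluate numerically. The Python sketch below rests on two assumptions that should be checked against the source: that for i.u.d. inputs the uncertainty about Z_1 given Z_2 = -m is log2(m+1) minus the average of h2(l/(m+1)) over the binomially distributed weight l of the first m+1 inputs, and that P(Z_2 = -m) = (m+1) p^m (1-p)^2. The function names and the truncation point `m_max` are illustrative choices.

```python
from math import comb, log2

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def b2(m):
    """Assumed form of H(Z_1 | Z_2 = -m, X, Y) for i.u.d. inputs."""
    n = m + 1
    # average h2(l / n) over the Binomial(n, 1/2) weight l of the first n inputs
    return log2(n) - sum(comb(n, l) * h2(l / n) for l in range(n + 1)) / 2 ** n

def l2_iud(p, m_max=400):
    """Truncated evaluation of the i = 2 lower bound for the BDC."""
    weighted = sum((m + 1) * p ** m * b2(m) for m in range(m_max + 1))
    return (1 + (1 - p) ** 2 * weighted) * (1 - p) - h2(p)
```

Because every term of the sum is non-negative, `l2_iud(p)` is never smaller than the simple bound 1 - p - h2(p). The truncation at `m_max` only matters for deletion probabilities close to 1.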

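Although evaluating $L_i^{\mathrm{iud}}$ for $i > 2$ is hard, we can further
lower bound it as follows.

$$\begin{aligned}
L_i^{\mathrm{iud}} &= (1 + H(Z_1|Z_i, \mathcal{X}, \mathcal{Y}))(1 - p) - h_2(p) \\
&\geq \Big(1 + \sum_{m=0}^{j} P(Z_i = -m)\, H(Z_1|Z_i = -m, \mathcal{X}, \mathcal{Y})\Big)(1 - p) - h_2(p) \\
&= 1 - p - h_2(p) + (1 - p)\, a_j^{(i)} \\
&= L_{i,j}^{\mathrm{iud}} \quad \forall\, j \geq 0,\ i \geq 1.
\end{aligned}$$

We can then use $C \geq \sup_{i \geq 1} L_{i,j}^{\mathrm{iud}} = L_{*,j}^{\mathrm{iud}}$ for some $j \geq 0$ as
a lower bound for the capacity. We proceed as follows:

$$a_j^{(i)} = a_{j-1}^{(i)} + p^{\otimes i}(-j) \cdot H(Z_1|Z_i = -j, \mathcal{X}, \mathcal{Y}) = a_{j-1}^{(i)} + p^{\otimes i}(-j)\, b_j^{(i)}. \tag{10}$$

Since $a_0^{(i)} = b_0^{(i)} = 0$, we have $a_1^{(i)} = i\,p\,(1 - p)^i\, b_1^{(i)}$. Let us
denote by $r_1(x_{[n]})$ the length of the first run in the vector $x_{[n]}$.
Then, we can show that, for $y_{[i-1]}$ received from $x_{[i]}$ with a
single deletion,

$$H(Z_1 \,|\, Z_i = -1, X_{[i]} = x_{[i]}, Y_{[i-1]} = y_{[i-1]}) = \begin{cases} h_2\big(1 / r_1(x_{[i]})\big), & \text{if the deletion lies in the first run of } x_{[i]} \\ 0, & \text{otherwise} \end{cases}$$

and hence

$$b_1^{(i)} = \frac{1}{i} \Big( \sum_{j=1}^{i-1} 2^{-j}\, j\, h_2\Big(\frac{1}{j}\Big) + 2^{-(i-1)}\, i\, h_2\Big(\frac{1}{i}\Big) \Big). \tag{11}$$

From (10) and (11),

$$L_{*,1}^{\mathrm{iud}} = 1 - p - h_2(p) + p\,(1 - p) \sup_{i \geq 1}\, (1 - p)^i \Big( \sum_{j=1}^{i-1} 2^{-j}\, j\, h_2\Big(\frac{1}{j}\Big) + 2^{-(i-1)}\, i\, h_2\Big(\frac{1}{i}\Big) \Big)$$

and thus

$$C \geq 1 + p \log_2 p - c\,p + O(p^2) \tag{12}$$

where $c = \log_2(2e) - \frac{1}{2} \sum_{j \geq 1} 2^{-j}\, j \log_2 j \approx 1.154163$. Note
that this is exactly the expression obtained for capacity for
small $p$ in [9]. In the evaluation of the above bound, we were
helped by the fact that when we restrict to the case of a single
deletion, the ambiguity in the first channel state $Z_1$ arises only
when $r_1(x_{[i]}) > 1$, in which case the uncertainty is exactly
$h_2\big(1 / r_1(x_{[i]})\big)$. This, however, is not true when there are 2 or
more deletions, wherein we will have to count subsequence
weights of sequences.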
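
The constant c can be reproduced numerically from its series. The Python sketch below assumes the form c = log2(2e) - (1/2) sum_{j>=1} 2^(-j) j log2(j) and evaluates the first-order expression 1 + p log2(p) - c p with the O(p^2) term dropped. The function names and the number of series terms are illustrative choices.

```python
from math import e, log2

def small_p_constant(terms=200):
    """Numerically sum the series defining c (it converges geometrically)."""
    series = sum(2.0 ** -j * j * log2(j) for j in range(1, terms + 1))
    return log2(2 * e) - 0.5 * series

def bdc_small_p_bound(p):
    """First-order expression 1 + p*log2(p) - c*p for 0 < p < 1."""
    return 1 + p * log2(p) - small_p_constant() * p

c = small_p_constant()
```

With 200 terms the series is summed to well within double precision, and `c` should agree with the value quoted above to the digits shown.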

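We can obtain similar bounds for symmetric first-order
Markov input processes. But these calculations will have to
keep track of ascents and descents in sequences, and are
therefore more tedious. We can write, for
$P(X_i = x \oplus 1 \,|\, X_{i-1} = x) = \alpha$,

$$L_2^{\mathrm{M1}} = \sup_{\alpha} \Big( h_2(\alpha) + (1 - p)^2 \sum_{m \geq 0} (m + 1)\, p^m\, b_m(\alpha) \Big) \times (1 - p) - h_2(p),$$

where

$$b_m(\alpha) = \log_2(m + 1) - \sum_{i=0}^{m+1} h_2\Big(\frac{i}{m+1}\Big)\, \pi(\alpha, i, m + 1),$$

and $\pi(\cdots)$ is defined recursively as

$$\begin{aligned}
\pi(\alpha, i, m) &= \tfrac{1}{2}\, \pi_0(\alpha, i, m) + \tfrac{1}{2}\, \pi_1(\alpha, i, m) \\
\pi_0(\alpha, i, m) &= (1 - \alpha)\, \pi_0(\alpha, i, m - 1) + \alpha\, \pi_1(\alpha, i - 1, m - 1) \\
\pi_1(\alpha, i, m) &= (1 - \alpha)\, \pi_1(\alpha, i - 1, m - 1) + \alpha\, \pi_0(\alpha, i, m - 1)
\end{aligned}$$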

with Trj{a,i,m) = G {0,m},j £ {0,1} and 

TTj{a,i,m) = for « ^ G {0, 1}. We can also evaluate 

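$$L_{*,1}^{\mathrm{M1}} = -h_2(p) + (1 - p) \times \sup_{\alpha} \Big[ h_2(\alpha) + p \cdot \sup_{i}\, (1 - p)^i \Big( \alpha \sum_{j=1}^{i-1} j\, (1 - \alpha)^{j-1}\, h_2\Big(\frac{1}{j}\Big) + i\, (1 - \alpha)^{i-1}\, h_2\Big(\frac{1}{i}\Big) \Big) \Big].$$

However, both $L_2^{\mathrm{M1}}$ and $L_{*,1}^{\mathrm{M1}}$ turn out to be better than their
SIR counterparts by less than 1%.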

B. Information Rates for the Sticky Channel 

The analysis for the sticky channel is very similar
to that for the BDC in the previous subsection. Since
$\lim_{n \to \infty} \frac{1}{n} I(Y_{[N_n]}; Z_{[N_n]}) \neq 0$ when the input is i.u.d., we
bound the $\mathcal{H}(\mathcal{Z}|\mathcal{Y})$ term (see Equation (6)) differently. In this
case, we obtain

H{Z^\Z,,X,y)-H{Zi\Yi) 



C > sup 



sup {n{x) 

i>l ^ 



l-p 
Fig. 1. Bounds on the capacity for the binary deletion channel. $L_2^{\mathrm{iud}}$ in
(9) is shown as the long-dashed line and $L_{*,1}^{\mathrm{iud}}$ in (12) (with the $O(p^2)$ term
dropped) as the solid line. The best known numerical lower [8] and upper
bounds [12] are shown as black and white circles respectively. The best known
lower bound as $p$ approaches 1 [6] is shown as the dash-dotted line. The inset
plots the bound (13) as the long-dashed line for the sticky channel, and the
Markov-1 rate (14) as the solid line. The lower bounds from [2] are shown
as black circles.



> sup 



h2{a) + sup - py-^H{Zi\z, = i,x,y) 



p+{l-a){l-p) 
l-p 



^^ip+[l-a){l-p)) 



(13) 



where 



i-l 



1 1 

H{Z^\Z, = l,X,y) = -^ ^(j + l)h2 (— ) (1 - ay a 

+ /^2(^)(l-a)^ 

For $\alpha = \frac{1}{2}$, we get $C \geq 1 + p \log_2 p + d\,p - O(p^2)$ where
$d = \log_2\big(\frac{2}{e}\big) + \frac{1}{2} \sum_{j \geq 1} 2^{-j}\, j \log_2 j \approx 0.845836$. As was the
case for the BDC, we expect this to be a tight bound for the
capacity for small $p$. In fact, for the sticky channel, we can
exactly characterize the maximum rate achievable by a
first-order Markov process $C^{\mathrm{M1}}$ as
C^i =sup ^h2{a) 

+"i:(a-«)^)'(i:f!VM^) 



r>l ^ 

p+{l-a){l-p) 



s>r 



'ip+{i 



p 



i-p " vp + (1 - - p) y J ^ ^ 

Figure 1 plots all the bounds obtained for the BDC and the sticky
channel.



We note that the Markov-1 rate (14) is larger than $1 - p$
for a range of $p$ values. This disproves the conjecture that the
capacity is convex in $p$ for the sticky channel, unlike what is
expected for the BDC [14].

V. Conclusions 

The model presented here provides a unified framework to 
handle a broad class of channels with synchronization errors 
over any finite alphabet. For channels with only deletions or 
only duplications, we obtain analytical lower bounds on the 
capacity, including some bounds that are expected to be tight 
for small deletion or duplication probabilities. More generally, 
the model has an immediate factor-graph interpretation, and 
this could potentially be used to explore reliable coding 
schemes. Moreover, it could facilitate the exploration of some 
fundamental theoretical questions, e.g., establishing a coding 
theorem for synchronization error channels with memory. A 
more detailed treatment of some of these questions, results 
for the BDC and sticky channel, and generalizations to other 
channels of interest is in preparation [15].